Multi‑agent reinforcement learning based on attentional message sharing
Rong ZANG, Li WANG, Tengfei SHI
Journal of Computer Applications    2022, 42 (11): 3346-3353.   DOI: 10.11772/j.issn.1001-9081.2021122169

Communication is an important way to achieve effective cooperation among multiple agents in a non-omniscient environment. When there are a large number of agents, redundant messages may be generated during communication. To handle communication messages effectively, a multi-agent reinforcement learning algorithm based on attentional message sharing was proposed, called AMSAC (Attentional Message Sharing multi-agent Actor-Critic). Firstly, a message sharing network was built for effective communication among agents, and information sharing was achieved through message reading and writing by the agents, thus solving the problem of lack of communication among agents in non-omniscient environments with complex tasks. Then, in the message sharing network, the communication messages were processed adaptively by the attentional message sharing mechanism, and the messages from different agents were processed in order of importance, to solve the problem that large-scale multi-agent systems cannot effectively identify and utilize messages during communication. Moreover, in the centralized Critic network, the Native Critic was used to update the Actor network parameters according to the Temporal Difference (TD) advantage policy gradient, so that the action values of agents were evaluated effectively. Finally, during execution, each agent's distributed Actor network made decisions based on the agent's own observations and the messages from the message sharing network. Experimental results in the StarCraft Multi-Agent Challenge (SMAC) environment show that, compared with Native Actor-Critic (Native AC), Game Abstraction Communication (GA-Comm) and other multi-agent reinforcement learning methods, AMSAC improves the average win rate by 4 to 32 percentage points in four different scenarios.
AMSAC's attentional message sharing mechanism provides a reasonable solution for processing communication messages among agents in a multi-agent system, and has broad application prospects in both transportation hub control and unmanned aerial vehicle collaboration.
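
The two ideas named in the abstract, weighting messages by importance and using a TD advantage to drive the Actor update, can be illustrated with a minimal sketch. The abstract does not specify the form of the attention or the network details, so the code below assumes scaled dot-product attention in which each message serves as both key and value, and a one-step TD error as the advantage estimate. The function names `attend` and `td_advantage` are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, messages):
    """Assumed attention form: scaled dot-product, messages act as keys and values.

    Returns the importance weight of each message and the weighted sum
    that the reading agent would receive.
    """
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, msg)) / math.sqrt(dim)
        for msg in messages
    ]
    weights = softmax(scores)
    pooled = [
        sum(w * msg[i] for w, msg in zip(weights, messages))
        for i in range(dim)
    ]
    return weights, pooled


def td_advantage(reward, value, next_value, gamma=0.99, done=False):
    """One-step TD error used as an advantage estimate: r + gamma * V(s') - V(s)."""
    bootstrap = 0.0 if done else gamma * next_value
    return reward + bootstrap - value


# Toy example: three agents have written 2-dimensional messages.
messages = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, pooled = attend([1.0, 0.0], messages)
advantage = td_advantage(reward=1.0, value=0.5, next_value=2.0, gamma=0.5)
```

In this toy setup the messages that align with the reading agent's query receive larger weights than the orthogonal one, which is the sense in which messages are processed by importance. In an actor-critic update, the advantage would scale the gradient of the log-probability of the chosen action.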
